Robust binarization of degraded document images using heuristics

نویسندگان

  • Jon Parker
  • Ophir Frieder
  • Gideon Frieder
چکیده

Historically significant documents are often discovered with defects that make them difficult to read and analyze. This fact is particularly troublesome if the defects prevent software from performing an automated analysis. Image enhancement methods are used to remove or minimize document defects, improve software performance, and generally make images more legible. We describe an automated, image enhancement method that is input page independent and requires no training data. The approach applies to color or greyscale images with hand written script, typewritten text, images, and mixtures thereof. We evaluated the image enhancement method against the test images provided by the 2011 Document Image Binarization Contest (DIBCO). Our method outperforms all 2011 DIBCO entrants in terms of average F1 measure – doing so with a significantly lower variance than top contest entrants. The capability of the proposed method is also illustrated using select images from a collection of historic documents stored at Yad Vashem Holocaust Memorial in Israel.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Ancient Document Images Enhancement Using Phase Based Binarization

In this paper, we present a phase-based binarization model for degraded document images, also a post processing method that can improve any binarization method and a ground truth generation tool. Usually, many binarization techniques are implemented in the literature for different types of binarization problems. It include an adaptive image contrast based document image binarization technique t...

متن کامل

A Robust Document Image Binarization Technique for Degraded Document Images

Segmentation of text from badly degraded document images is a very challenging task due to the high inter/intravariation between the document background and the foreground text of different document images. In this paper, we propose a novel document image binarization technique that addresses these issues by using adaptive image contrast. The adaptive image contrast is a combination of the loca...

متن کامل

Robust Document Image Binarization Technique for Degraded Document Images by using Morphological Operators

To make a robust document images from badly degraded images it is necessary to discriminating a text from background images but it is a very challenging task. There are so many binarization techniques can be used for making the document pictures reliable. But problem of thresholding and filtering cannot be solved. In the existing method, edge based segmentation can be done and Canny edge detect...

متن کامل

A Binarization Method for Degraded Document Images with Morphological Operations

In this paper, we propose an effective binarization method for de-graded document images in this paper. This method employs morphological operations throughout its algorithm to suppress uneven illumination in the background region, to detect the character location and to reconstruct text regions. Moreover, a technique for estimating stroke width of characters is introduced to remove noises in a...

متن کامل

Binarization of Document Image

Documents Image Binarization is performed in the preprocessing stage for document analysis and it aims to segment the foreground text from the document background. A fast and accurate document image binarization technique is important for the ensuing document image processing tasks such as optical character recognition (OCR). Though document image binarization has been studied for many years, t...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2014